Integrate a New Technology in an Existing Crawler Category

Besides developing a new crawler, you may need to extend the functionality of an existing one. For example, the RDBMS crawler fetches metadata and data from relational database systems. Although this crawler is generic and supports a wide range of databases, some database technologies do not fit the existing framework as-is, for instance because their metadata format differs slightly. In such cases, you must enhance the existing crawler to support the new database or database flavor.

This section describes the technical changes required to add a new database technology. Adding a technology to an existing category such as RDBMS requires specific changes in several microservices.

Let us consider the example of MariaDB.

For each microservice, the entries below list the class, interface, or DB change, followed by its purpose.
Microservice: plf-elab-service
com.calibo.platform.elab.enums.ProviderEnum
public enum ProviderEnum {
    MYSQL, MS_SQL_SERVER, ORACLE, POSTGRE_SQL, MARIA_SQL
}
Purpose: Add a new enum constant for the tech stack. The name is a free-text constant, but it must be unique.
com.calibo.platform.elab.enums.DatastoreDriverEnum

Add a new entry modeled on the existing MYSQL entry:

MYSQL("com.mysql.cj.jdbc.Driver", "jdbc:mysql://%s:%s/%s?zeroDateTimeBehavior=convertToNull"),

Purpose: Add an entry with the MariaDB-specific driver details. Refer to the MariaDB documentation for the driver class and connection URL to use.
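As a rough illustration of what such an entry could look like, the sketch below uses a stand-in enum rather than the real DatastoreDriverEnum. The constant name MARIA_DB and the getDriverClass() accessor appear elsewhere in this section. The driver class org.mariadb.jdbc.Driver and the jdbc:mariadb:// URL prefix are those of MariaDB Connector/J. The enum name DatastoreDriverSketch, the buildUrl helper, and the choice to add no URL parameters are assumptions made for the example, so confirm the final values against the MariaDB documentation.

```java
// Sketch only: a stand-in for DatastoreDriverEnum, not the platform class.
enum DatastoreDriverSketch {
    MYSQL("com.mysql.cj.jdbc.Driver",
          "jdbc:mysql://%s:%s/%s?zeroDateTimeBehavior=convertToNull"),
    // Assumed values: driver class and URL prefix of MariaDB Connector/J.
    MARIA_DB("org.mariadb.jdbc.Driver", "jdbc:mariadb://%s:%s/%s");

    private final String driverClass;
    private final String urlTemplate;

    DatastoreDriverSketch(String driverClass, String urlTemplate) {
        this.driverClass = driverClass;
        this.urlTemplate = urlTemplate;
    }

    String getDriverClass() {
        return driverClass;
    }

    // Hypothetical helper: fills host, port, and database into the template.
    String buildUrl(String host, int port, String database) {
        return String.format(urlTemplate, host, port, database);
    }
}
```

With these assumed values, DatastoreDriverSketch.MARIA_DB.buildUrl("localhost", 3306, "sales") produces jdbc:mariadb://localhost:3306/sales.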

Microservice: plf-configuration-service

DB Changes

DB name: configuration

Table name: SETTING

DB script:

INSERT INTO `setting` (`name`, `description`, `config_code`, `section`, `sub_section`, `config_version`, `selected`, `default`, `provider_code`, `logo`, `version`, `created_by`, `created_on`, `updated_by`, `updated_on`) VALUES
('MariaDB', 'MariaDB is an open-source relational database management system.', 'DATA_STORES', NULL, NULL, NULL, false, false, 'MariaSQL', '/mariasql.png', 0, 'tdas@altimetrik.com', now(), 'tdas@altimetrik.com', now());
Purpose: Every new RDBMS type needs a new entry in this table.
com.calibo.platform.core.enums.ProviderEnum
public enum ProviderEnum {
    MYSQL, MS_SQL_SERVER, ORACLE, POSTGRE_SQL, MARIA_SQL
}
Purpose: Add a new enum constant for the tech stack. The name is a free-text constant, but it must be unique.
Microservice: plf-data-pipeline-designer
com.calibo.platform.dpd.util.RelationalDatabaseUtil

Method name: public RelationalDatasourceGenericRepository getRelationalDatasourceRepository(ProviderEnum type)

Add a new case to the switch statement.

Example:

switch (type) {
    case RDBMS:
    case MYSQL:
    case AWS_RDS_MYSQL:
    case AWS_S3:
    case POSTGRE_SQL:
    case AWS_RDS_POSTGRE_SQL:
    case AZURE_RDS_POSTGRE_SQL:
    case SNOWFLAKE:
    case MS_SQL_SERVER:
    case AWS_RDS_MARIA_DB:
    case ORACLE:
    case MARIA_SQL: // This is new
        return rdsOrchestrationService;
}
Purpose: Route MariaDB into the RDBMS-specific service flow, so that it is treated as an RDBMS type and handled by the existing RDBMS logic.
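The fall-through mapping can be tried in isolation with the minimal sketch below. Every type in it is a simplified stand-in, not the platform code: ProviderSketch, RepositorySketch, and RelationalDatabaseUtilSketch are hypothetical names, MONGO_DB is a hypothetical non-relational provider, and the default branch that rejects unsupported providers is an assumption, since the original method's fallback behavior is not shown here.

```java
// Sketch only: reduced stand-ins for ProviderEnum and the repository type.
enum ProviderSketch { MYSQL, POSTGRE_SQL, ORACLE, MARIA_SQL, MONGO_DB }

interface RepositorySketch {
    String name();
}

final class RelationalDatabaseUtilSketch {
    // One shared service handles every relational provider.
    private final RepositorySketch rdsOrchestrationService = () -> "rdsOrchestrationService";

    RepositorySketch getRelationalDatasourceRepository(ProviderSketch type) {
        switch (type) {
            case MYSQL:
            case POSTGRE_SQL:
            case ORACLE:
            case MARIA_SQL: // new: falls through to the shared RDBMS service
                return rdsOrchestrationService;
            default:
                // Assumed behavior for providers outside the RDBMS category.
                throw new IllegalArgumentException("Unsupported provider: " + type);
        }
    }
}
```

Because MARIA_SQL shares the case group, a lookup for MARIA_SQL returns the same service instance as a lookup for MYSQL, and no MariaDB-specific branch is needed at this level.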
com.calibo.platform.dpd.repository.impl.RdsOrchestrationImpl

Method name: private void setDriverAndDriverName(RdsOrchestratorBean rdsOrchestratorBean, ProviderEnum subType)

Add the following case inside the switch statement:

case MARIA_SQL: // the ProviderEnum constant added for MariaDB
    rdsOrchestratorBean.setDriver(DatastoreDriverEnum.MARIA_DB.getDriverClass());
    rdsOrchestratorBean.setDriverName(MARIADB);
    break;
Purpose: Set the MariaDB driver on the orchestrator bean. The driver class comes from the MariaDB entry in com.calibo.platform.elab.enums.DatastoreDriverEnum.
Microservice: plf-common-orchestrator-service
com.calibo.platform.common.util.RelationalDatasourceGenericRepository

Method name: public RelationalDatasourceGenericRepository getRelationalDatasourceRepository(ProviderEnum type)

Inside the switch statement, modify the case below by adding the MariaDB entry to it:

case RDBMS, MYSQL, AWS_RDS_MYSQL, MARIA_SQL -> genericMysqlRepository;

 

Write a new class that implements MysqlRelationalDatasourceGenericRepository and RDBMSMetadataHelper.

Implement the following methods:

  • getTableList

  • getJoinedPreview

  • generateQuery

  • getTableDescriptionFromQuery

  • fetchMetadata

  • getQueryData

  • getConnection

  • getConnectionOfDataSource

  • getMetadata

Purpose: Implement a new class for technology-specific integration.
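The real method signatures are defined by the platform interfaces and are not reproduced in this section. As a hedged sketch of what one of these methods might look like, the example below implements only getTableList over plain JDBC. The interface TableListingRepositorySketch, the class name MariaDbRepositorySketch, and the parameter list are all hypothetical; the query string is the one this section maps to MySQL and MariaDB in DataSourceConstants.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, reduced contract; the real interfaces declare more methods.
interface TableListingRepositorySketch {
    List<String> getTableList(Connection connection, String schema) throws SQLException;
}

final class MariaDbRepositorySketch implements TableListingRepositorySketch {
    // Same information_schema query that the MYSQL case uses.
    static final String TABLES_QUERY =
        "SELECT TABLE_NAME as tablename FROM information_schema.tables where table_schema = ?";

    @Override
    public List<String> getTableList(Connection connection, String schema) throws SQLException {
        List<String> tables = new ArrayList<>();
        // try-with-resources closes the statement and result set in reverse order.
        try (PreparedStatement statement = connection.prepareStatement(TABLES_QUERY)) {
            statement.setString(1, schema); // bind the schema instead of concatenating it
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    tables.add(rows.getString("tablename"));
                }
            }
        }
        return tables;
    }
}
```

Binding the schema name as a parameter keeps the query text constant, which is why the same SQL string can be shared between technologies.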
com.calibo.platform.common.constants.DataSourceConstants

The methods listed below hold the database-specific SQL statements.

For each method, identify the case whose SQL also works for the new technology and add the new entry to that case. If no existing case fits, create a new case.

Methods:

  • getSelectQuery

  • getTablesQuery

  • InfoSchemaQuery

case MYSQL, MARIADB ->
    "SELECT TABLE_NAME as tablename FROM information_schema.tables where table_schema = ?";

Purpose: These methods define the SQL used for specific operations. Map MariaDB to the case whose SQL it can share. If no existing case matches, for example because the query must differ for MariaDB, add a new case and map the query to it.
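The two options, sharing an existing case or adding a new one, can be sketched as follows. DialectSketch and DataSourceConstantsSketch are hypothetical stand-ins for the platform types, and the ORACLE query against ALL_TABLES is included only to illustrate a case whose SQL differs; it is an assumption, not taken from the platform code.

```java
// Sketch only: stand-in for the type the real switch operates on.
enum DialectSketch { MYSQL, MARIADB, ORACLE }

final class DataSourceConstantsSketch {
    private DataSourceConstantsSketch() {
    }

    static String getTablesQuery(DialectSketch dialect) {
        return switch (dialect) {
            // MariaDB shares the MySQL case because the same SQL works for both.
            case MYSQL, MARIADB ->
                "SELECT TABLE_NAME as tablename FROM information_schema.tables where table_schema = ?";
            // Illustrative separate case for a dialect that needs different SQL.
            case ORACLE ->
                "SELECT TABLE_NAME AS tablename FROM ALL_TABLES WHERE OWNER = ?";
        };
    }
}
```

Because the switch expression covers every constant of the enum, the compiler reports an error if a newly added constant is not mapped to a query, which helps catch a technology that was added to the enum but not to the SQL mapping.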

 

